Nonnative speech recognition based on state-candidate bilingual model modification
Abstract
Speech recognition accuracy has been observed to decrease for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach to improve nonnative speech recognition that accounts for the wide variation in accented pronunciation. Each state of the baseline nonnative acoustic models is modified with several candidate states from auxiliary acoustic models, which are trained on speech in the speakers' mother language. The state mapping criterion and the number of n-best candidates are investigated on a grammar-constrained speech recognition system. Compared with nonnative acoustic models that had already been well trained with maximum a posteriori (MAP) adaptation, the state-candidate bilingual model modification approach achieved a further relative reduction in phrase error rate (RPhrER) of 7.87%.
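The abstract reports its result as a relative, not absolute, reduction in phrase error rate. As a minimal sketch of that arithmetic, assuming the usual definition (the difference between the baseline and new error rates, divided by the baseline), the function below computes such a figure. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative error-rate reduction, in percent, against a baseline.

    Assumes the common definition: 100 * (baseline - new) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error rates for illustration only (not results from the paper):
# a drop from 20.0% to 18.0% is a 2-point absolute and a 10% relative reduction.
example = relative_reduction(baseline_error=20.0, new_error=18.0)
```

Under this definition, a relative reduction of 7.87% means the modified models remove 7.87% of the phrase errors that remain with the MAP-adapted baseline, whatever the absolute error rates are.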
Similar resources
Development of a Mandarin-English Bilingual Speech Recognition System with Unified Acoustic Models
This paper presents our recent work on the development of a grammar-constrained, Mandarin-English bilingual Speech Recognition System (MESRS) for real-world music retrieval. Two of the main difficulties in bilingual speech recognition for real-world applications are tackled: one is to balance the performance and the complexity of the bilingual speech recognition system; the othe...
A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition
Nonnative speech recognition is becoming more and more important as many speech applications are deployed worldwide. Meanwhile, due to the large population of nonnative speakers, speaker adaptation remains the most practical way to provide high-performance speech services. Subspace Gaussian Mixture Model (SGMM) has recently been shown to yield superior performance on various native speech r...
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the richest and most immediate means of expressing human emotion, and it conveys cognitive and semantic concepts between people. In this study, a statistical method for emotion recognition from speech signals is proposed, together with a learning approach based on the statistical model to classify the internal feelings of the utterance....
Online Unsupervised Multilingual Acoustic Model Adaptation for Nonnative ASR
Automatic speech recognition (ASR) is currently one of the main research interests in computer science, and many ASR systems are available on the market. Yet the performance of speech and language recognition systems is poor on nonnative speech. The challenge for nonnative speech recognition is to maximize the accuracy of a speech recognition system when only a small amount of nonnative dat...
Recent Progress in the Decodin with Multilingual Aco
In this paper we report on recent progress in the use of multilingual Hidden Markov Models for the recognition of non-native speech. While we have previously discussed the use of bilingual acoustic models and recognizer combination methods, we now seek to avoid the increased computational load imposed by methods such as ROVER by focusing on acoustic models that share training data from 5 langua...